-
Notifications
You must be signed in to change notification settings - Fork 85
Expose cuFile stream register API to Python #893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose cuFile stream register API to Python #893
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
madsbk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might consider using something like the following for the CUDA Stream bindings
from rmm.pylibrmm.stream import Streamhttps://docs.rapids.ai/api/rmm/stable/python/pylibrmm/#rmm.pylibrmm.stream.Stream
Co-authored-by: Mads R. B. Kristensen <[email protected]>
Co-authored-by: Mads R. B. Kristensen <[email protected]>
|
I've changed the parameter name to |
madsbk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the raw_stream name.
For your suggestion, do you think we need to add Python RMM dependency to KvikIO, or add KvikIO's own stream wrapper on C++ and Python level? @madsbk
At some point, yes. Or use cuda-python's CUDA stream, or roll our own :)
|
That makes sense. I've created this ticket #895 . |
|
/merge |
|
The segmentation fault mentioned in the PR description is actually caused by a tricky bug in Cython stream casting, fixed by #904. |
#893 contains a tricky bug. The following is incorrect: ```python def stream_register(stream: int, flags: int) -> None: cdef CUstream cpp_stream = <CUstream>stream ``` `stream` is a Python object. The cast applies to the address of this Python object, instead of the underlying integer value. This surprisingly does not trigger a compile error. There are two ways to fix this bug: - Method 1: double casting. The additional cast to `uintptr_t` extracts the underlying integer value of `stream` correctly. ```python def stream_register(stream: int, flags: int) -> None: cdef CUstream cpp_stream = <CUstream><uintptr_t>stream ``` - Method 2: `uintptr_t` as type hint (used in this PR). `uintptr_t` is not just a type hint; it actually causes `stream` to be a C object. ```python def stream_register(stream: uintptr_t, flags: int) -> None: cdef CUstream cpp_stream = <CUstream>stream ``` For both method, `uintptr_t` must be `cimport`-ed into the source file. Authors: - Tianyu Liu (https://github.com/kingcrimsontianyu) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: #904
This PR exposes cuFile stream register and deregister functions to KvikIO Python API.
The stream register and deregister is an optional, performance-tuning API. The users specify the precondition of the I/O parameters by setting the
flagsthat indicate when the I/O parameters become valid and whether they are page aligned.The added Python API can be unit tested using:
pytest ~/kvikio/python/kvikio/tests/test_async_io.py::test_stream_register_deregister -m cufile -vsxA known issue is that, under CUDA 12.9, which is the only setting tested, if the file path is not on a drive with full GDS support, the above unit test will pass but end with a segmentation fault. This is beyond the scope of this PR and is likely not an issue in KvikIO. Further investigation is a good-to-have.